Results 1 - 14 of 14
1.
Front Immunol ; 15: 1364954, 2024.
Article in English | MEDLINE | ID: mdl-38510238

ABSTRACT

Introduction: Inflammatory conditions in patients have various causes and require different treatments. Bacterial infections are treated with antibiotics, while these medications are ineffective against viral infections. Autoimmune diseases and graft-versus-host disease (GVHD) after allogeneic stem cell transplantation require immunosuppressive therapies such as glucocorticoids, which may be contraindicated in other inflammatory states. In this study, we employ a combination of straightforward blood tests to devise an explainable artificial intelligence (XAI) for distinguishing between bacterial infections, viral infections, and autoimmune diseases/graft-versus-host disease. Patients and methods: We analysed peripheral blood from 80 patients with inflammatory conditions and 38 controls. Complete blood count, CRP analysis, and a rapid flow cytometric test for myeloid activation markers CD169, CD64, and HLA-DR were utilized. A two-step XAI distinguished firstly with C5.0 rules pruned by ABC analysis between controls and inflammatory conditions and secondly between the types of inflammatory conditions with a new bivariate decision tree using the Simpson impurity function. Results: Inflammatory conditions were distinguished using an XAI, achieving an overall accuracy of 81.0% (95% CI 72-87%). Bacterial infection (N = 30), viral infection (N = 26), and autoimmune diseases/GVHD (N = 24) were differentiated with accuracies of 90.3%, 80.0%, and 79.0%, respectively. The most critical parameter for distinguishing between controls and inflammatory conditions was the expression of CD64 on neutrophils. Monocyte count and expression of CD169 were most crucial for the classification within the inflammatory conditions. Conclusion: Treatment decisions for inflammatory conditions can be effectively guided by XAI rules, straightforward to implement and based on promptly acquired blood parameters.
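
Editorial note: the abstract above mentions a decision tree built with a "Simpson impurity function" but does not define it. The Python sketch below assumes this means the Gini-Simpson index, 1 - sum(p_i^2), over the class proportions in a node; that reading, the function names, and the toy labels are assumptions for illustration, not the authors' implementation.

```python
from collections import Counter


def simpson_impurity(labels):
    """Gini-Simpson index of a node: 1 - sum of squared class proportions.

    Assumption: this is what the abstract calls the Simpson impurity.
    Returns 0.0 for a pure (or empty) node.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(left, right):
    """Size-weighted impurity of a candidate split into two child nodes."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return (len(left) * simpson_impurity(left)
            + len(right) * simpson_impurity(right)) / n


# Hypothetical toy labels, not study data.
mixed = ["bacterial", "viral", "autoimmune"]
print(simpson_impurity(mixed))
print(split_impurity(["bacterial", "bacterial"], ["viral", "viral"]))
```

A tree builder would evaluate `split_impurity` for each candidate threshold and keep the split with the lowest value; the bivariate splitting used in the study is not reproduced here.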


Subject(s)
Autoimmune Diseases , Bacterial Infections , Graft vs Host Disease , Virus Diseases , Humans , Artificial Intelligence , Autoimmune Diseases/diagnosis , Autoimmune Diseases/therapy
2.
Curr Oncol ; 30(2): 1903-1915, 2023 02 04.
Article in English | MEDLINE | ID: mdl-36826109

ABSTRACT

BACKGROUND: The International Prognostic Index (IPI) is applied to predict the outcome of chronic lymphocytic leukemia (CLL) with five prognostic factors, including genetic analysis. We investigated whether multiparameter flow cytometry (MPFC) data of CLL samples could predict the outcome by methods of explainable artificial intelligence (XAI). Further, XAI should explain the results based on distinctive cell populations in MPFC dot plots. METHODS: We analyzed MPFC data from the peripheral blood of 157 patients with CLL. The ALPODS XAI algorithm was used to identify cell populations that were predictive of inferior outcomes (death, failure of first-line treatment). The diagnostic ability of each XAI population was evaluated with receiver operating characteristic (ROC) curves. RESULTS: ALPODS defined 17 populations with higher ability than the CLL-IPI to classify clinical outcomes (ROC: area under curve (AUC) 0.95 vs. 0.78). The best single classifier was an XAI population consisting of CD4+ T cells (AUC 0.78; 95% CI 0.70-0.86; p < 0.0001). Patients with low CD4+ T cells had an inferior outcome. The addition of the CD4+ T-cell population enhanced the predictive ability of the CLL-IPI (AUC 0.83; 95% CI 0.77-0.90; p < 0.0001). CONCLUSIONS: The ALPODS XAI algorithm detected highly predictive cell populations in CLL that may be able to refine conventional prognostic scores such as IPI.
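
Editorial note: the abstract above compares classifiers by the area under the ROC curve (AUC). As a minimal sketch of what that number measures, the Python function below computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name and the toy scores are hypothetical and unrelated to the study data.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative cases.

    scores: numeric classifier outputs (higher means more likely positive).
    labels: 1 for the positive class (e.g. inferior outcome), 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: perfectly separated scores.
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values (0.95 vs. 0.78) should be read.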


Subject(s)
Leukemia, Lymphocytic, Chronic, B-Cell , Humans , Prognosis , Leukemia, Lymphocytic, Chronic, B-Cell/drug therapy , Artificial Intelligence , Algorithms
3.
Cytometry A ; 103(4): 304-312, 2023 04.
Article in English | MEDLINE | ID: mdl-36030398

ABSTRACT

Minimal residual disease (MRD) detection is a strong predictor for survival and relapse in acute myeloid leukemia (AML). MRD can be determined either by molecular assessment strategies or via multiparameter flow cytometry. The degree of bone marrow (BM) dilution with peripheral blood (PB) increases with aspiration volume, causing consecutive underestimation of the residual AML blast amount. In order to prevent false-negative MRD results, we developed Cinderella, a simple automated method for one-tube simultaneous measurement of hemodilution in BM samples and MRD level. The explainable artificial intelligence (XAI) Cinderella was trained and validated with the digital raw data of a flow cytometric "8-color" AML-MRD antibody panel in 126 BM and 23 PB samples from 35 patients. Cinderella predicted PB dilution with high accordance compared to the results of the Holdrinet formula (Pearson's correlation coefficient r = 0.94, R² = 0.89, p < 0.001). Unlike conventional neural networks, Cinderella calculated the distributions of 12 different cell populations that were assigned to true hematopoietic counterparts in a human-in-the-loop (HIL) approach. Besides characteristic BM cells such as myelocytes and myeloid progenitor cells, the XAI identified discriminating populations that were not specific for BM or PB (e.g., T cell/NK cell subpopulations and CD45-negative cells) and considered their frequency differences. Thus, Cinderella represents a HIL-XAI algorithm capable of calculating the degree of hemodilution in BM samples with an AML MRD immunophenotype panel. It is explicable, transparent, and paves a simple way to prevent false-negative MRD reports.


Subject(s)
Bone Marrow , Leukemia, Myeloid, Acute , Humans , Neoplasm, Residual/diagnosis , Artificial Intelligence , Hemodilution
4.
Bioengineering (Basel) ; 9(11)2022 Nov 03.
Article in English | MEDLINE | ID: mdl-36354555

ABSTRACT

"Big omics data" provoke the challenge of extracting meaningful information with clinical benefit. Here, we propose a two-step approach, an initial unsupervised inspection of the structure of the high dimensional data followed by supervised analysis of gene expression levels, to reconstruct the surface patterns on different subtypes of acute myeloid leukemia (AML). First, Bayesian methodology was used, focusing on surface molecules encoded by cluster of differentiation (CD) genes to assess whether AML is a homogeneous group or segregates into clusters. Gene expressions of 390 patient samples measured using microarray technology and 150 samples measured via RNA-Seq were compared. Beyond acute promyelocytic leukemia (APL), a well-known AML subentity, the remaining AML samples were separated into two distinct subgroups. Next, we investigated which CD molecules would best distinguish each AML subgroup against APL, and validated discriminative molecules of both datasets by searching the scientific literature. Surprisingly, a comparison of both omics analyses revealed that CD339 was the only overlapping gene differentially regulated in APL and other AML subtypes. In summary, our two-step approach for gene expression analysis revealed two previously unknown subgroup distinctions in AML based on surface molecule expression, which may guide the differentiation of subentities in a given clinical-diagnostic context.

5.
Data Brief ; 43: 108382, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35799850

ABSTRACT

Three different flow cytometry datasets are provided, consisting of diagnostic samples of either peripheral blood (pB) or bone marrow (BM) from patients without any sign of bone marrow disease at two different health care centers. In flow cytometry, cells rapidly pass through a laser beam one by one, and two light-scatter parameters and eight surface parameters are measured for more than 100,000 cells per patient sample. The technology swiftly characterizes cells of the immune system at the single-cell level based on antigens presented on the cell surface that are targeted by a set of fluorochrome-conjugated antibodies. The first dataset consists of N=14 sample files measured in Marburg and the second dataset of N=44 data files measured in Dresden, of which half are BM samples and half are pB samples. The third dataset contains N=25 healthy bone marrow samples and N=25 leukemia bone marrow samples measured in Marburg. The data have been log-scaled to a range between zero and six and used to identify cell populations that are simultaneously meaningful to the clinician and relevant to the distinction of pB vs BM, and BM vs leukemia. Explainable artificial intelligence methods should distinguish these samples and provide meaningful explanations for the classification without taking more than several hours to compute their results. The data described in this article are available in Mendeley Data [1].

6.
Sci Rep ; 11(1): 20245, 2021 Oct 06.
Article in English | MEDLINE | ID: mdl-34615989
7.
Sci Rep ; 11(1): 18988, 2021 09 23.
Article in English | MEDLINE | ID: mdl-34556686

ABSTRACT

Benchmark datasets with predefined cluster structures and high-dimensional biomedical datasets outline the challenges of cluster analysis: clustering algorithms are limited in their clustering ability in the presence of clusters defining distance-based structures resulting in a biased clustering solution. Data sets might not have cluster structures. Clustering yields arbitrary labels and often depends on the trial, leading to varying results. Moreover, recent research indicated that all partition comparison measures can yield the same results for different clustering solutions. Consequently, algorithm selection and parameter optimization by unsupervised quality measures (QM) are always biased and misleading. Only if the predefined structures happen to meet the particular clustering criterion and QM, can the clusters be recovered. Results are presented based on 41 open-source algorithms which are particularly useful in biomedical scenarios. Furthermore, comparative analysis with mirrored density plots provides a significantly more detailed benchmark than that with the typically used box plots or violin plots.

8.
MethodsX ; 7: 101093, 2020.
Article in English | MEDLINE | ID: mdl-33134096

ABSTRACT

Projections are conventional methods of dimensionality reduction for information visualization used to transform high-dimensional data into low dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to visualize the relative relationships between high-dimensional data points that build up distance and density-based structures. However, the Johnson-Lindenstrauss lemma states that the two-dimensional similarities in the scatter plot cannot coercively represent high-dimensional structures. Here, a simplified emergent self-organizing map uses the projected points of such a scatter plot in combination with the dataset in order to compute the generalized U-matrix. The generalized U-matrix defines the visualization of a topographic map depicting the misrepresentations of projected points with regards to a given dimensionality reduction method and the dataset.
• The topographic map provides accurate information about the high-dimensional distance and density based structures of high-dimensional data if an appropriate dimensionality reduction method is selected.
• The topographic map can uncover the absence of distance-based structures.
• The topographic map reveals the number of clusters in a dataset as the number of valleys.

9.
PLoS One ; 15(10): e0238835, 2020.
Article in English | MEDLINE | ID: mdl-33052923

ABSTRACT

One aim of data mining is the identification of interesting structures in data. For better analytical results, the basic properties of an empirical distribution, such as skewness and possible clipping, i.e. hard limits in value ranges, need to be assessed. Of particular interest is the question of whether the data originate from one process or contain subsets related to different states of the data producing process. Data visualization tools should deliver a clear picture of the univariate probability density distribution (PDF) for each feature. Visualization tools for PDFs typically use kernel density estimates and include both the classical histogram and modern tools such as ridgeline plots, bean plots and violin plots. If density estimation parameters remain in a default setting, conventional methods pose several problems when visualizing the PDF of uniform, multimodal, skewed distributions and distributions with clipped data. For that reason, a new visualization tool called the mirrored density plot (MD plot), which is specifically designed to discover interesting structures in continuous features, is proposed. The MD plot does not require adjusting any parameters of density estimation, which may make the use of this plot compelling, particularly to non-experts. The visualization tools in question are evaluated against statistical tests with regard to typical challenges of explorative distribution analysis. The results of the evaluation are presented using bimodal Gaussian, skewed distributions and several features with already published PDFs. In an exploratory data analysis of 12 features describing quarterly financial statements, where statistical testing poses a great difficulty, only the MD plots can identify the structure of their PDFs. In sum, the MD plot outperforms the above-mentioned methods.


Subject(s)
Data Visualization , Algorithms , Data Interpretation, Statistical , Data Mining , Humans , Monte Carlo Method , Normal Distribution , Probability , Software , Stochastic Processes
10.
Cytometry B Clin Cytom ; 98(6): 476-482, 2020 11.
Article in English | MEDLINE | ID: mdl-32716606

ABSTRACT

BACKGROUND: The Matutes score (MS) was proposed to differentiate chronic lymphocytic leukemia (CLL) from other B-cell non-Hodgkin lymphomas (B-NHLs). However, ambiguous immunophenotypes are common and remain a diagnostic challenge. Therefore, we evaluated the diagnostic benefit of measuring CD200 and CD43 expression together with the standard MS antigens. METHODS: 138 lymphoma patient samples and a validation cohort of 138 additive samples were classified according to the standard MS and further assigned with one or two additional points, for high CD200 and/or CD43 expression levels. The "classical" MS and the "Matutes score-extended" (MS-e) were categorized as high (4-5/6-7), intermediate (2-3/4-5), and low (0-1/0-3). Samples were reclassified into the MS-e with focus on ambiguous cases with an intermediate "classical" MS. RESULTS: A total of 35 of 138 (25.4%) patient samples were assigned to the intermediate MS group and confirmed by histopathological reports as CLL (14/40.0%) and B-NHLs other than CLL (21/60%). MS-e analysis identified 13 of 14 (92.9%) of CLL cases (MS-e 4-5) and 18/21 (85.7%) non-CLL cases (MS-e ≤ 3) correctly. Overall, the sensitivity of the CLL diagnosis was significantly increased by application of MS-e compared to the "classical" MS (98.8% vs. 82.7%; p = 0.0009), while specificity of both methods was almost equal (94.7% vs. 98.3%; p = 0.4795). Of note, sole measurement of CD43 and CD200 on B-cells sufficiently differentiated CLL from non-CLL with a test accuracy superior to the "classical" MS (F1 score 96.2 vs. 93.6). CONCLUSION: CD200 and CD43 have a high informative value in diagnostic immunophenotyping and facilitate the separation of CLL from other B-NHLs particularly in ambiguous cases.
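
Editorial note: the scoring scheme in the abstract above can be sketched in a few lines of Python. The category boundaries (classical MS: low 0-1, intermediate 2-3, high 4-5; MS-e: low 0-3, intermediate 4-5, high 6-7) and the one extra point each for high CD200 and high CD43 expression are taken from the abstract. The function names are hypothetical, and the cut-offs that decide whether expression counts as "high" are not given in the abstract, so they are passed in as booleans.

```python
def classify_ms(score):
    """Category of the classical Matutes score (0-5)."""
    if not 0 <= score <= 5:
        raise ValueError("classical Matutes score must be between 0 and 5")
    if score >= 4:
        return "high"
    if score >= 2:
        return "intermediate"
    return "low"


def extended_score(ms, cd200_high, cd43_high):
    """MS-e: classical score plus one point each for high CD200 and high CD43.

    Assumption: cd200_high and cd43_high are already decided upstream;
    the expression thresholds are not stated in the abstract.
    """
    if not 0 <= ms <= 5:
        raise ValueError("classical Matutes score must be between 0 and 5")
    return ms + int(cd200_high) + int(cd43_high)


def classify_ms_e(score):
    """Category of the extended Matutes score (0-7)."""
    if not 0 <= score <= 7:
        raise ValueError("extended Matutes score must be between 0 and 7")
    if score >= 6:
        return "high"
    if score >= 4:
        return "intermediate"
    return "low"


# Hypothetical ambiguous case: classical MS of 3 with both markers high.
ms = 3
ms_e = extended_score(ms, cd200_high=True, cd43_high=True)
print(classify_ms(ms), ms_e, classify_ms_e(ms_e))
```

In the abstract, intermediate classical cases with MS-e 4-5 were read as CLL and those with MS-e ≤ 3 as non-CLL; the sketch only reproduces the arithmetic and the category labels, not that diagnostic interpretation.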


Subject(s)
Antigens, CD/immunology , Leukemia, Lymphocytic, Chronic, B-Cell/diagnosis , Leukosialin/immunology , Lymphoma, B-Cell/diagnosis , Antigens, CD/isolation & purification , B-Lymphocytes/immunology , B-Lymphocytes/pathology , Biomarkers, Tumor/immunology , Cell Differentiation/genetics , Cell Differentiation/immunology , Diagnosis, Differential , Female , Gene Expression Regulation , Humans , Immunophenotyping/methods , Leukemia, Lymphocytic, Chronic, B-Cell/immunology , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Leukosialin/isolation & purification , Lymphoma, B-Cell/immunology , Lymphoma, B-Cell/pathology , Male
11.
Data Brief ; 30: 105501, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32373681

ABSTRACT

The Fundamental Clustering Problems Suite (FCPS) offers a variety of clustering challenges that any algorithm should be able to handle given real-world data. The FCPS consists of datasets with known a priori classifications that are to be reproduced by the algorithm. The datasets are intentionally created to be visualized in two or three dimensions under the hypothesis that objects can be grouped unambiguously by the human eye. Each dataset represents a certain problem that can be solved by known clustering algorithms with varying success. In the R package "Fundamental Clustering Problems Suite" on CRAN, user-defined sample sizes can be drawn for the FCPS. Additionally, the distances of two high-dimensional datasets called Leukemia and Tetragonula are provided here. This collection is useful for investigating the shortcomings of clustering algorithms and the limitations of dimensionality reduction methods in the case of three-dimensional or higher datasets. This article is a simultaneous co-submission with Swarm Intelligence for Self-Organized Clustering [1].

12.
Pain ; 159(7): 1366-1381, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29596157

ABSTRACT

Heat pain and its modulation by capsaicin vary among subjects in experimental and clinical settings. A plausible cause is a genetic component, of which TRPV1 ion channels, by their response to both heat and capsaicin, are primary candidates. However, TRPA1 channels can heterodimerize with TRPV1 channels and carry genetic variants reported to modulate heat pain sensitivity. To address the role of these candidate genes in capsaicin-induced hypersensitization to heat, pain thresholds acquired before and after topical application of capsaicin and TRPA1/TRPV1 exomic sequences derived by next-generation sequencing were assessed in n = 75 healthy volunteers; the genetic information comprised 278 loci. Gaussian mixture modeling indicated 2 phenotype groups with high or low capsaicin-induced hypersensitization to heat. Unsupervised machine learning implemented as swarm-based clustering hinted at differences in the genetic pattern between these phenotype groups. Several methods of supervised machine learning implemented as random forests, adaptive boosting, k-nearest neighbors, naive Bayes, support vector machines, and for comparison, binary logistic regression predicted the phenotype group association consistently better when based on the observed genotypes than when using a random permutation of the exomic sequences. Of note, TRPA1 variants were more important for correct phenotype group association than TRPV1 variants. This indicates a role of the TRPA1 and TRPV1 next-generation sequencing-based genetic pattern in the modulation of the individual response to heat-related pain phenotypes. When considering earlier evidence that topical capsaicin can induce neuropathy-like quantitative sensory testing patterns in healthy subjects, implications for future analgesic treatments with transient receptor potential inhibitors arise.


Subject(s)
Machine Learning , Pain Threshold/physiology , Pain/genetics , TRPA1 Cation Channel/genetics , TRPV Cation Channels/genetics , Capsaicin/pharmacology , Genetic Association Studies , Genotype , High-Throughput Nucleotide Sequencing , Hot Temperature , Humans , Pain Threshold/drug effects
13.
Sci Rep ; 6: 31536, 2016 08 30.
Article in English | MEDLINE | ID: mdl-27572284

ABSTRACT

High-frequency, in-situ monitoring provides large environmental datasets. These datasets will likely bring new insights in landscape functioning and process scale understanding. However, tailoring data analysis methods is necessary. Here, we detach our analysis from the usual temporal analysis performed in hydrology to determine if it is possible to infer general rules regarding hydrochemistry from available large datasets. We combined a 2-year in-stream nitrate concentration time series (time resolution of 15 min) with concurrent hydrological, meteorological and soil moisture data. We removed the low-frequency variations through low-pass filtering, which suppressed seasonality. We then analyzed the high-frequency variability component using Pareto Density Estimation, which to our knowledge has not been applied to hydrology. The resulting distribution of nitrate concentrations revealed three normally distributed modes: low, medium and high. Studying the environmental conditions for each mode revealed the main control of nitrate concentration: the saturation state of the riparian zone. We found low nitrate concentrations under conditions of hydrological connectivity and dominant denitrifying biological processes, and we found high nitrate concentrations under hydrological recession conditions and dominant nitrifying biological processes. These results generalize our understanding of hydro-biogeochemical nitrate flux controls and bring useful information to the development of nitrogen process-based models at the landscape scale.
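
Editorial note: the abstract above describes removing low-frequency variation with a low-pass filter and then analysing the remaining high-frequency component. The Python sketch below illustrates that general idea with a centred moving average as the low-pass filter. The abstract does not say which filter or window was used, so the moving average, the window length, and the toy series are assumptions; Pareto Density Estimation itself is not reproduced here.

```python
def moving_average(series, window):
    """Centred moving average; the window shrinks at the series edges.

    Assumption: a simple stand-in for the unspecified low-pass filter.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def high_frequency_component(series, window):
    """Residual after subtracting the low-pass (slowly varying) trend."""
    trend = moving_average(series, window)
    return [x - t for x, t in zip(series, trend)]


# Hypothetical toy series: a slow upward drift plus a fast alternation.
toy = [float(i) + (0.5 if i % 2 else -0.5) for i in range(9)]
print(high_frequency_component(toy, window=3))
```

With 15-minute data, a window spanning several weeks would suppress seasonality while leaving event-scale variability in the residual; the appropriate window is a modelling choice that the abstract leaves open.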


Subject(s)
Databases, Factual , Environmental Monitoring , Nitrates/analysis , Rivers/chemistry
14.
Int J Mol Sci ; 16(10): 25897-911, 2015 Oct 28.
Article in English | MEDLINE | ID: mdl-26516852

ABSTRACT

Biomedical data obtained during cell experiments, laboratory animal research, or human studies often display a complex distribution. Statistical identification of subgroups in research data poses an analytical challenge. Here we introduce an interactive R-based bioinformatics tool, called "AdaptGauss". It enables a valid identification of a biologically-meaningful multimodal structure in the data by fitting a Gaussian mixture model (GMM) to the data. The interface allows a supervised selection of the number of subgroups. This enables the expectation maximization (EM) algorithm to adapt a more complex GMM than usually observed with a noninteractive approach. Interactively fitting a GMM to heat pain threshold data acquired from human volunteers revealed a distribution pattern with four Gaussian modes located at temperatures of 32.3, 37.2, 41.4, and 45.4 °C. Noninteractive fitting was unable to identify a meaningful data structure. Obtained results are compatible with known activity temperatures of different TRP ion channels suggesting the mechanistic contribution of different heat sensors to the perception of thermal pain. Thus, sophisticated analysis of the modal structure of biomedical data provides a basis for the mechanistic interpretation of the observations. As it may reflect the involvement of different TRP thermosensory ion channels, the analysis provides a starting point for hypothesis-driven laboratory experiments.
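
Editorial note: the abstract above relies on the expectation maximization (EM) algorithm to fit a one-dimensional Gaussian mixture model from a user-chosen number of modes. The Python sketch below shows a plain EM loop for that setting, started from user-supplied means, standard deviations, and weights (a stand-in for the interactive selection). It is not the AdaptGauss API, which is an R package; all names, the iteration count, the `min_sd` floor, and the toy data are assumptions.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_gmm_1d(data, means, sds, weights, iterations=50, min_sd=1e-3):
    """Fit a 1-D Gaussian mixture by EM from user-supplied starting values."""
    means, sds, weights = list(means), list(sds), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, sds)]
            total = sum(dens)
            if total == 0.0:
                resp.append([1.0 / k] * k)   # fall back to equal shares
            else:
                resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0.0:
                continue                     # leave an empty component as is
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), min_sd)  # guard against collapse
            weights[j] = nj / len(data)
    return means, sds, weights


# Hypothetical toy data with two well-separated groups (not study data).
toy = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
fit_means, fit_sds, fit_weights = em_gmm_1d(
    toy, means=[0.0, 6.0], sds=[1.0, 1.0], weights=[0.5, 0.5])
print(fit_means, fit_weights)
```

Because EM only converges to a local optimum, the starting values matter, which is consistent with the abstract's point that an interactively guided fit can recover structure that a noninteractive run misses.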


Subject(s)
Hot Temperature , Nociceptive Pain/metabolism , Pain Threshold , Thermosensing , Adolescent , Adult , Algorithms , Female , Humans , Male , Models, Neurological , Nociceptive Pain/physiopathology , Transient Receptor Potential Channels/metabolism